1  Research  Objectives

Complexity  theory  is  the  study  of  resource-bounded  computation.  The  aim  of  this 
project  is  to  study  the  amount  of  resources,  in  particular,  time  and  hardware,  used 
in  neural  network  computations.  Research  will  focus  on  four  major  topics: 

1.  The  relative  computing  power  of  various  neural  network  models. 

2.  Algorithms  for  neural  network  computations;  upper-bounds,  lower-bounds 
and  completeness  properties. 

3.  Fault-tolerant  computation. 

4.  Learning. 


2  Accomplishments 

The Investigators have made significant progress in laying a foundation for a
complexity theory of neural networks. The complexity class $TC^0$ has been identified
as the prime class of interest for neural network computation. It consists of the
problems that can be solved by small, fast neural networks, that is, those whose
size grows only polynomially with the number of inputs and which have a fixed number
of layers. The complexity class remains the same under many different neural
network models, for example, even if probabilistic ([19]), multi-valued ([15]), or analog
neurons are allowed. Progress has also been made on problems related to learning
and fault tolerance. More details follow.


2.1  Foundations 

The article by Parberry [18] has laid the groundwork for the study of the
complexity of neural networks. The paper makes a case for the importance of the
complexity theory of neural networks by comparing and contrasting it with
conventional sequential, parallel and probabilistic complexity theory, and collects
much of the knowledge that can be fairly easily deduced from standard results in
complexity theory. This will form the background against which our research will
be presented. The key resources of time, size (number of neurons) and weight (total
weight of all connections) are identified. The latter two resources give some
indication of the amount of hardware that will be needed to implement neural networks.
The links between neural network based complexity classes and the standard classes
are explored. There is no significant difference between the two until running time
is reduced to a constant and hardware to a polynomial. In this case (for example,
the class $TC^0$ of languages that can be recognized in polynomial hardware and
constant depth) little is known.
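As a concrete illustration of a small, constant-depth network built from threshold gates, the following minimal sketch computes PARITY of $n$ bits with one hidden layer, $n + 1$ gates and weights in {+1, -1}. The construction is the standard textbook one and is not taken from [18]; the function names `threshold_gate` and `parity_network` are assumptions made for this example.

```python
from itertools import product


def threshold_gate(weights, threshold, inputs):
    """Binary threshold gate: output 1 iff the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def parity_network(bits):
    """Depth-2 threshold circuit for PARITY with n hidden gates and one output gate."""
    n = len(bits)
    # Hidden gate k (k = 1..n) fires iff at least k of the inputs are 1.
    hidden = [threshold_gate([1] * n, k, bits) for k in range(1, n + 1)]
    # The output gate takes the alternating sum +1, -1, +1, ... of the hidden gates.
    # If m inputs are 1, exactly the first m hidden gates fire, so the alternating
    # sum is 1 when m is odd and 0 when m is even.
    out_weights = [1 if k % 2 == 1 else -1 for k in range(1, n + 1)]
    return threshold_gate(out_weights, 1, hidden)


if __name__ == "__main__":
    # Exhaustive check on all 4-bit inputs.
    for bits in product([0, 1], repeat=4):
        assert parity_network(list(bits)) == sum(bits) % 2
    print("parity network agrees with sum mod 2 on all 4-bit inputs")
```

The size is linear in the number of inputs and the depth is fixed at two layers, which is the regime that the class $TC^0$ is meant to capture.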
Parberry [18] also contains a few previously unpublished results, most notably
the following.

1. The weights of a neural network can be made ±1 by increasing the size from
$z$ to $z^4 \log^3 z$ and the depth by a constant multiple. This is a smaller increase
in size than previously obtained in Parberry and Schnitger [19].

2. Any function that can be computed by a conventional$^1$ circuit of size $z$ and
depth $d$ can be computed by a neural network of depth $d/(\epsilon \log\log z)$ and size
$O(z^{1+\epsilon})$, for any $\epsilon > 0$. This means that polynomial size neural networks are
faster than conventional bounded fan-in circuits by a factor of $\log\log n$. The
exponent in our result is smaller than the one previously known.

3.  The  NP-completeness  of  some  problems  related  to  the  termination  of  cyclic 
neural  networks  has  been  strengthened  to  some  restricted  cases,  including 
bounded  degree,  and  the  property  that  if  there  is  a  terminating  computation 
then  there  is  at  least  one  which  does  so  in  polynomial  time. 

2.2  Pebbling 

One of the advantages that neural networks have over conventional circuits is
unbounded fan-in. There is a well-known relationship between the size of a conventional
circuit and the depth of an unbounded fan-in circuit: any function computed in size $z$
by the former can be computed in depth $O(z/\log z)$ (and possibly size exponential
in $z$) by the latter. Kalyanasundaram and Schnitger [11] have improved this result
by reducing the size substantially.

2.3  Boltzmann  Machines 

We have formulated and developed the thesis that the class $TC^0$ is fundamental to
neural network computations, in that it characterizes the languages recognizable by
small, fast neural networks. In Parberry and Schnitger [19], we showed that this
is true even for probabilistic models (such as the Boltzmann machine) of polynomial
weight. Our efforts in this direction have led to an interest in $TC^0$ by many
prominent complexity theorists.

2.4  Computing  with  Noisy  Neurons 

We consider the scenario in which each neuron has a small probability of failing,
and we wish to construct a network that reliably computes the correct result with
probability at least some fixed constant greater than one half. The reliable simulation of
fault-free conventional circuits by a faulty neural network with a log-linear increase
in size and a constant multiple increase in depth is described in Parberry [18]. This
can be extended to neural networks with a small fixed fan-in [6]. However, the more
desirable fault-tolerant simulation of a fault-free neural network appears difficult.
There is evidence from VLSI and from natural neural systems that it is
architecturally advantageous to physically separate the summation from the thresholding in
the neuron. In [6] we propose such a model, called the summation network. It is not
too different from the standard model, in the sense that each can simulate the other
in a fault-free environment with only a polynomial increase in size and a constant
multiple increase in depth (thus polynomial hardware, constant depth summation
circuits also recognize $TC^0$). Nonetheless, summation networks are much easier to
analyze in the presence of faults. We were able to obtain a reliable simulation of
a fault-free summation network by a faulty summation network with a log-linear
increase in size and a constant multiple increase in depth.

$^1$ We use the term conventional circuit to describe a circuit constructed from two-input NAND
gates.
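To give some intuition for why redundancy helps when neurons fail independently, the sketch below computes the exact probability that a majority vote over $r$ independent copies of a gate is wrong. It is only an illustration and not the construction of [18] or [6]: it assumes that the voter itself never fails, which the fault model of this section does not grant, and the name `majority_failure_probability` is hypothetical.

```python
from math import comb


def majority_failure_probability(p, r):
    """Probability that more than half of r independent copies fail.

    p is the failure probability of a single copy; r must be odd so that
    the majority is always well defined.
    """
    if r % 2 == 0:
        raise ValueError("r must be odd")
    # Sum the binomial tail over all outcomes with a failing majority.
    return sum(
        comb(r, j) * p**j * (1 - p) ** (r - j)
        for j in range(r // 2 + 1, r + 1)
    )


if __name__ == "__main__":
    for r in (1, 3, 5, 7, 9):
        print(r, majority_failure_probability(0.1, r))
```

With a single-copy failure probability of 0.1, three copies already bring the error of the vote down to about 0.028, and the error keeps shrinking as more copies are added.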

2.5  Complexity  of  Approximation 

It is often hoped that neural networks will be useful in solving NP-complete
problems. It is apparent from [18] that no polynomial size neural network can ever
solve such a problem exactly, and that it is easy to find an exponential size network
that does. A much more reasonable aim would be to use neural networks to give
approximate solutions, that is, solutions that are sufficiently close to the optimum.
Unfortunately, a well-developed theory of approximation algorithms is not yet
available. Berman and Schnitger [9] have contributed to the foundations of such
a theory by investigating "approximation complete" problems. Strong evidence is
provided indicating that Constraint Satisfaction problems of quite simple structure
cannot be approximated satisfactorily in polynomial time. Any such problem would
be as difficult to approximate by a neural network as by a conventional computation
[5].

2.6  Communication  Complexity 

If neural networks are to be implemented in VLSI, it is likely that efficient methods
of solving problems in Numerical Linear Algebra will be needed. The communication
complexity of a function $f$ measures the communication capacity that any system
computing $f$ must provide. In the design of VLSI systems, where savings in chip
area and computation time are desired, this complexity dictates an area $\times$ time$^2$
lower bound. Chu and Schnitger [10] investigate the communication complexity of
determining whether a given square matrix $M$ is singular. We show that, for $n \times n$
matrices of $k$-bit integers, the communication complexity of this problem is $\Theta(kn^2)$.
In case the entries of $M$ are elements of a finite field of size $p$, we also prove the
communication complexity of this problem to be $\Theta(n^2 \log p)$. Our results imply
tight bounds for a wide variety of other problems in Numerical Linear Algebra.
Among those problems are determining the rank and computing the determinant
of a matrix, as well as the computation of several matrix decompositions. Another
important corollary concerns the solvability of linear systems. In this problem it
has to be decided whether a linear system $Ax = b$ has a solution. When $A$ is an
$n \times n$ matrix of $k$-bit integers and $b$ a vector of $n$ $k$-bit integers, we determine its
communication complexity to be $\Theta(kn^2)$.
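For readers who want the underlying decision problem in concrete form, here is a minimal sketch of a singularity test for a square matrix over a prime field, using Gaussian elimination. It illustrates only the problem whose communication complexity is studied in [10], not any communication protocol or bound from that paper. The function name `is_singular_mod_p` is an assumption, and the field size `p` is assumed to be prime.

```python
def is_singular_mod_p(matrix, p):
    """Return True iff the square matrix is singular over GF(p), p prime."""
    n = len(matrix)
    # Work on a reduced copy so the caller's matrix is left untouched.
    rows = [[entry % p for entry in row] for row in matrix]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return True  # no pivot available: the rank is less than n
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inverse = pow(rows[col][col], -1, p)  # modular inverse (Python 3.8+)
        for r in range(col + 1, n):
            factor = rows[r][col] * inverse % p
            if factor:
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[col])]
    return False


if __name__ == "__main__":
    m = [[1, 2], [3, 4]]  # determinant is -2
    print(is_singular_mod_p(m, 5), is_singular_mod_p(m, 2))
```

In the communication setting the entries of the matrix are split between two parties, so a trivial protocol has one party send its whole share; the results above say that, up to constant factors, nothing substantially better is possible.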

2.7  Learning  from  Tests  and  Counter-examples 

It  is  hoped  that  neural  networks  will  be  more  appropriate  for  learning  than  conven¬ 
tional  computer  models.  The  theory  of  learning  has  recently  been  the  subject  of 
much  interest  following  the  seminal  contributions  of  Valiant. 

Berman and Roos [8] have extended the work of Angluin on learning finite-state
languages to show that deterministic one-counter languages (a large subset of the
context-free languages) can be learned in polynomial time. In this model of learning
the student (i.e. the learning algorithm) can test whether a chosen word belongs to
the given language. After a sequence of such tests the student constructs a machine
consistent with all tests and examples collected so far, and uses the constructed
machine to predict the membership of future examples. After an incorrect
prediction the student constructs another machine with the aid of additional tests.
This cycle may repeat a number of times, but the number of wrong predictions (i.e.
the number of counter-examples) and the total time used for internal computations
and tests are bounded by a polynomial in the number of states of the machine that
recognizes the given language.

While the algorithm of Angluin always returns the minimal machine consistent
with the data, the algorithm of Berman and Roos merely constructs an equivalent
machine whose size is polynomially related to that of the minimal one. This is
unavoidable, given the current state of knowledge: while deterministic finite
automata can be efficiently minimized, no feasible minimization procedure is known
for one-counter languages.
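To make the class of target languages concrete, the following sketch simulates a deterministic one-counter machine and instantiates it for the language of words $a^n b^n$. It shows the kind of machine being learned, not the learning algorithm of [8]. The transition encoding and the acceptance condition (accepting state and empty counter) are assumptions chosen for this example, as is the name `run_one_counter`.

```python
def run_one_counter(transitions, start, accepting, word):
    """Simulate a deterministic one-counter machine on a word.

    transitions maps (state, symbol, counter_is_zero) to (next_state, delta),
    where delta is the change applied to the counter. A missing entry means
    the machine rejects. Acceptance requires an accepting state and counter 0.
    """
    state, counter = start, 0
    for symbol in word:
        key = (state, symbol, counter == 0)
        if key not in transitions:
            return False
        state, delta = transitions[key]
        counter += delta
        if counter < 0:
            return False  # the counter is never allowed to go negative
    return state in accepting and counter == 0


# Machine for { a^n b^n : n >= 0 }: count the a's up, then count the b's down.
ANBN = {
    ("A", "a", True): ("A", +1),
    ("A", "a", False): ("A", +1),
    ("A", "b", False): ("B", -1),
    ("B", "b", False): ("B", -1),
}

if __name__ == "__main__":
    for w in ["", "ab", "aabb", "aab", "abb", "ba", "abab"]:
        print(repr(w), run_one_counter(ANBN, "A", {"A", "B"}, w))
```

A membership test in the learning model corresponds to one call of `run_one_counter` on the unknown target machine; the student sees only the answer, not the transition table.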

2.8  Multi-valued  Neurons 

Much experimental neural network research involves analog neurons, which input
real values and output real values. However, whilst the theory of analog neural
networks developed to date uses real numbers, experimental work is typically
performed on digital computers. Surprisingly, the simulations bear out the theory, even
though the former are inherently discrete and the latter inherently analog. Thus it
appears that neural networks are robust in terms of precision. This is a particularly
important trait, since it is impossible to fabricate analog hardware that has
arbitrarily high precision. In particular, biological systems perform well with wetware
that has analog behavior but only limited precision.

The Principal Investigator, Ian Parberry (in cooperation with his Ph.D. student
Zoran Obradovic), undertook to investigate analog neural networks with limited
precision. In digital simulations, the activation levels of the neurons are limited
to some fixed number of values, $k$, which depends on the particular computer in
use. The computational and learning complexity of limited precision analog neural
networks was investigated, with a particular emphasis on how the number of neurons
and running time scale with $k$, as well as with the size of the problem being solved.

The key to the research was the demonstration in [15] (also submitted to Journal
of Computer and System Sciences) that limited precision analog neural networks
with $k$ activation levels are equivalent to discrete neural networks with $k$ levels of
activation and $k - 1$ thresholds, as opposed to the traditional single one. The
computational complexity of these $k$-ary neural networks was studied in [15], and
the learning complexity in [16,17].

The work in [15] extends the traditional binary discrete neural network
complexity theory (see [18]) to the new multi-level discrete case. The reader is referred
to the journal papers for details. One typical result is that, unlike the binary and
ternary cases, the threshold values for the $k$-ary case where $k > 3$ cannot be fixed.
For example, in the binary case, the threshold can be made 0. In the ternary case,
the two thresholds can be made 0 and 1. In the general case, no fixed thresholds
will suffice. If $k$ is restricted to grow only polynomially with the size of the problem
being solved, then polynomial size, constant depth $k$-ary neural networks compute
only functions from $TC^0$, the classical complexity class for binary neural networks.
This implies that the superiority of analog neural networks over discrete binary ones
can only confer a polynomial advantage in size and a constant multiple in depth. However,
that polynomial may still be significant.
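A minimal sketch of a $k$-ary neuron with $k - 1$ thresholds may help fix ideas. The convention used here, that the output is the number of thresholds reached by the weighted sum, is an assumption made for illustration; the precise definition is the one given in [15]. The name `k_ary_neuron` is likewise hypothetical.

```python
from bisect import bisect_right


def k_ary_neuron(weights, thresholds, inputs):
    """Discrete neuron with k activation levels 0..k-1 and k-1 thresholds.

    thresholds must be sorted in increasing order. The output is the number
    of thresholds that the weighted input sum reaches (sum >= threshold).
    """
    total = sum(w * x for w, x in zip(weights, inputs))
    # bisect_right counts the thresholds that are less than or equal to total.
    return bisect_right(thresholds, total)


if __name__ == "__main__":
    # Ternary neuron with the fixed thresholds 0 and 1 mentioned in the text.
    for x in (-1, 0, 1, 2):
        print(x, k_ary_neuron([1], [0, 1], [x]))
```

With a single threshold the function reduces to the ordinary binary threshold gate, so the binary theory is the special case $k = 2$.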

The work in [16,17] extends the learning algorithms for the binary discrete
neuron to the $k$-ary case. Efficient versions of the Perceptron Learning Algorithm and
Littlestone's Winnow Algorithm are given, proved correct, and analyzed.
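As background, the classical Perceptron Learning Algorithm for a single binary threshold neuron can be sketched as follows. This is the standard binary version only; the $k$-ary extensions of [16,17] are not reproduced here, and the function names and the `max_epochs` cut-off are assumptions made for the example.

```python
def predict(weights, bias, inputs):
    """Binary threshold neuron: 1 iff the weighted sum plus bias is positive."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


def train_perceptron(examples, max_epochs=1000):
    """Classical perceptron rule on (inputs, label) pairs with labels in {0, 1}.

    Returns (weights, bias) once a full pass makes no mistakes. Raises
    RuntimeError if no consistent neuron is found within max_epochs passes,
    which happens in particular when the data are not linearly separable.
    """
    n = len(examples[0][0])
    weights, bias = [0] * n, 0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, label in examples:
            error = label - predict(weights, bias, inputs)  # -1, 0 or +1
            if error:
                # Move the separating hyperplane towards the misclassified example.
                weights = [w + error * x for w, x in zip(weights, inputs)]
                bias += error
                mistakes += 1
        if mistakes == 0:
            return weights, bias
    raise RuntimeError("no consistent threshold neuron found")


if __name__ == "__main__":
    and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    w, b = train_perceptron(and_examples)
    print(w, b)
```

The algorithm only ever changes the weights after a wrong prediction, which is what makes the number of mistakes the natural complexity measure for this style of learning.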

2.9  Lower  Bounds  for  Depth  3 

Ian Parberry and his student Peiyuan Yan have made some progress on
lower-bounds. Whilst it is extremely difficult to obtain exponential lower-bounds on
the size required by constant depth neural networks to compute certain functions,
we have made some progress by restricting the power of the neurons. We [21]
consider depth 2 circuits of mod-$p$ and mod-$q$ gates augmented with the limited
use of AND and OR gates with small fan-in. We are able to show an exponential
size lower-bound for certain depth-3 circuits of these gates for computing Boolean
conjunctions.

2.10  Computing  with  Analog  Neurons 

In [20] Georg Schnitger (in cooperation with Wolfgang Maass of the University of
Illinois and Eduardo Sontag of Rutgers University) examined the computing power
of feedforward networks with sigmoid (i.e. smooth) threshold gates for computing
Boolean functions.

A threshold gate with inputs $x_1, \ldots, x_n$, weights $w_1, \ldots, w_n$ and threshold $t$
outputs the real number $\gamma(\sum_{i=1}^{n} w_i x_i - t)$. Popular choices for the gate function $\gamma$
include the binary threshold function (i.e. $\gamma(y) = 1$ if $y \geq 0$ and $\gamma(y) = 0$ if $y < 0$)
and smooth threshold functions (i.e. $\gamma$ is differentiable).

We demonstrate, for a large class of smooth gate functions $\gamma$ (including the
standard sigmoid $\sigma(x) = \frac{1}{1 + \exp(-x)}$), that their corresponding feedforward nets
are computationally at least as efficient as feedforward nets composed of binary
threshold gates. Moreover, we exhibit a problem (namely to decide whether exactly
one of two $n$-bit strings has a majority of ones) that can be solved with one hidden
layer and 5 smooth gates, but the same problem cannot be solved with one hidden
layer and constantly many binary threshold gates.

This raises the question whether smooth threshold gates give rise to dramatically
increased computing power compared with binary threshold gates. Under quite
liberal assumptions (which are satisfied by the standard sigmoid) we show that
a feedforward net with $n$ binary inputs, $s$ smooth gates and $d$ hidden layers can
be simulated within the same number of layers by a feedforward net composed of
$O(\mathrm{poly}(n + s))$ binary threshold gates.

Thus, disregarding a polynomial increase in size, smooth threshold functions
and binary threshold functions are computationally equivalent. If we do not
disregard polynomial increases, smooth threshold functions are computationally at least
as powerful as binary threshold functions and provably more powerful for certain
problems.
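The easy half of this comparison can be seen in a small experiment: on Boolean inputs with integer weights, a sigmoid gate with a steep enough slope rounds to the same value as the binary threshold gate. The sketch below is an intuition aid and not the proof from [20]; the `gain` parameter, the half-integer threshold and the function names are assumptions made for the example.

```python
import math
from itertools import product


def sigmoid(y):
    """Standard sigmoid 1 / (1 + exp(-y)), written to avoid overflow."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    z = math.exp(y)
    return z / (1.0 + z)


def binary_gate(weights, t, inputs):
    """Binary threshold gate: 1 iff the weighted sum minus t is nonnegative."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) - t >= 0 else 0


def smooth_gate(weights, t, inputs, gain=1.0):
    """Sigmoid gate applied to gain * (weighted sum - t)."""
    return sigmoid(gain * (sum(w * x for w, x in zip(weights, inputs)) - t))


if __name__ == "__main__":
    # Majority of three bits. With integer weights and threshold 1.5 the
    # argument of the gate function is at least 0.5 away from zero.
    weights, t = [1, 1, 1], 1.5
    for bits in product([0, 1], repeat=3):
        smooth = smooth_gate(weights, t, bits, gain=20.0)
        print(bits, binary_gate(weights, t, bits), round(smooth, 5))
```

The separation result above goes in the other direction and is the substantive one: it shows that the intermediate real values of smooth gates can be exploited, which no rescaling of this kind can imitate with a constant number of binary gates.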

2.11  Fault  Tolerance 

With his student Mirjana Obradovic (who was supported by this grant), Piotr
Berman was working on optimizing threshold gates, i.e. on minimizing the sum
of the weights (assuming integer weights). When the weights are allowed to be
large integers, merely testing the equivalence of two gates is a co-NP-complete
problem, hence optimization cannot be feasible. However, when the sum of the
weights of even one of the gates involved in the equivalence test is polynomial,
an equivalence test can in polynomial time return either a confirmation of the
equivalence or a counterexample. We have developed a heuristic that uses this
equivalence test as follows. It maintains a set of examples for the given threshold
gate, a set of proven inequalities of the form "this input should have a value
of the target function at least as high as that input", and a small set of assumed
inequalities. The goal is to construct a linearly ordered list of combinations of input
variables, such that the minimal weight of each input is equal to its rank in the
list. This work is currently being continued with two graduate students, Nicol So and
Ching-hoi Sze.

While this work is still in preparation, the partial results happened to have very
interesting applications in the area of management of replicated databases [13,14].
Here the subject is a database in which data items are replicated and distributed
among some number of sites, which may improve both reliability (a failure of several
data sites does not render a piece of data unreachable) and access (local rather than
remote reads). A static scheme decides whether a database transaction may be performed
depending on the set of processors that can, at a particular instant, communicate with
the originator of the transaction. In a voting scheme the sets of processors allowed
to execute are characterized by a distribution of votes and a quorum threshold. We
have characterized several important classes of systems in which voting schemes
provide the optimal static scheme. Moreover, we introduced efficient and practical
algorithms to compute the optimal distribution of votes. A part of our technique is
an efficient test for the equivalence of threshold gates.
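The link between voting schemes and threshold gates can be made concrete with a small brute-force sketch: a set of sites may execute exactly when its votes reach the quorum, which is the output of a threshold gate on the indicator vector of the set. The code enumerates all subsets and is therefore exponential in the number of sites; it is not the efficient equivalence test mentioned above, and all function names are assumptions made for this illustration.

```python
from itertools import combinations


def quorums(votes, quorum):
    """All sets of sites whose votes add up to at least the quorum threshold."""
    sites = sorted(votes)
    found = []
    for size in range(len(sites) + 1):
        for group in combinations(sites, size):
            if sum(votes[s] for s in group) >= quorum:
                found.append(frozenset(group))
    return found


def quorums_intersect(votes, quorum):
    """True iff every two quorums share a site, so no two disjoint groups can both execute."""
    qs = quorums(votes, quorum)
    return all(a & b for a in qs for b in qs)


def equivalent_schemes(votes_a, quorum_a, votes_b, quorum_b):
    """True iff two vote assignments over the same sites allow the same sets to execute."""
    return set(quorums(votes_a, quorum_a)) == set(quorums(votes_b, quorum_b))


if __name__ == "__main__":
    one_each = {"A": 1, "B": 1, "C": 1}
    two_each = {"A": 2, "B": 2, "C": 2}
    print(quorums_intersect(one_each, 2))
    print(equivalent_schemes(one_each, 2, two_each, 3))
```

Two different vote distributions can describe the same scheme, which is why minimizing the weights of a threshold gate and choosing an optimal distribution of votes are closely related tasks.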

With his former student Juan A. Garay (now at IBM T.J. Watson Research
Center), Berman continued investigations on the Distributed Consensus problem.
In this problem a group of processors has the task of reaching a common decision.
Each processor has its initial option (typically, a 0/1 value) and the common
decision must be consistent with the initial option of one of the processors. There
are two complications that make this problem non-trivial: the communication is
conducted via bilateral links (so no 'public' vote is possible) and some of the
processors are faulty. No assumptions whatsoever are placed on the behavior of the
faulty processors, e.g. they could be controlled by an omniscient adversary.

The goal of our research was to provide solutions with better quality parameters
than the previous ones. The parameters that we study are the following: the
resiliency, i.e. the tolerated number of faulty processors, the number of
communication rounds, and the amount of communication. So far, we do not know any
solution that would be superior simultaneously in all these aspects. We found a
solution that uses 1-bit messages and has asymptotically optimal resiliency (3/4
of the optimum) and number of rounds (2 times the optimum). In collaboration with
K.J. Perry of IBM Watson we found a solution that has optimal resiliency, while
the message size is limited to 2 bits and the number of exchange rounds is 3 times
larger than the optimal one. In both cases we can substantially reduce the number
of rounds by increasing the message size to a higher constant (this is quite important
in practice, since the cost of sending a one-page message and a one-bit message is
usually the same). Both protocols have the form of a simple sequence of votes; in the
second protocol there is a possibility of casting an undecided vote (hence 2 bits in
a message, rather than 1). These results and their applications are the subject of

Another group of results concerned protocols with an optimal (rather than near
optimal) number of rounds and relatively small (so-called polynomial) message size.
One of these results was presented at FOCS and is the subject of the paper invited
to a Special Issue of the journal Mathematical Systems Theory [2]. Another is the
subject of [1]. While these results are also based on voting, the votes are nested
recursively, which could easily lead to a huge message size. The techniques developed
by Berman and Garay allow a processor to avoid participation in most of the possible
votes, thereby reducing the message size. In particular, a set of rules was found
that allows the faulty processors that "harm" the computation to be identified quickly
and the outcomes of avoidable votes to be deduced.

The experience gained in the work on Distributed Agreement allowed us to
obtain some interesting results on fault diagnosis for multiprocessor distributed
systems (in cooperation with Andrzej Pelc of the University of Quebec [7]). In the
fault diagnosis model we assume that the faulty processors compute unreliably and
can alter the content of transmitted messages; however, they can be detected by
their network neighbors with some probability. Moreover, faulty processors form a
random subset of the system. The previous diagnosis technique was based on a
simple threshold: the processors are diagnosed to be faulty based on the number of
"failed tests" (a good processor may fail a test if the latter is "administered" by a
faulty one). We have shown that the quality of diagnosis improves substantially if
we form a graph of processors and solve a maximum independent set problem for
this graph (an arc is introduced between two processors whenever one of them claims
that the other has failed its test). While the maximum independent set problem is
in general not feasible, we have shown that it suffices to form a collection of very
small graphs and tackle them separately. Moreover, we have exhibited a scheme
that allows the test results to be distributed reliably through the system even with
a very small number of connections (if we have $n$ processors, then the number of
links and tests is of the order $n \log n$; we have proven that this order of growth is
sufficient and necessary).
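The graph-based diagnosis step can be sketched on a toy instance. The code below builds the accusation graph and finds a maximum independent set by exhaustive search, which is affordable only because the graphs are very small. Treating the processors inside the maximum independent set as the presumed fault-free ones is our reading of the description above and should be taken as an assumption, as should the function name and the example data.

```python
from itertools import combinations


def maximum_independent_set(nodes, edges):
    """Largest set of nodes with no edge between any two of them (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    # Try candidate sets from largest to smallest; the first hit is maximum.
    for size in range(len(nodes), -1, -1):
        for group in combinations(nodes, size):
            if all(frozenset(pair) not in edge_set for pair in combinations(group, 2)):
                return set(group)
    return set()


if __name__ == "__main__":
    processors = [0, 1, 2, 3, 4]
    # An edge (u, v) records that one of u, v claims the other failed its test.
    accusations = [(0, 4), (1, 4), (4, 2)]
    presumed_good = maximum_independent_set(processors, accusations)
    presumed_faulty = set(processors) - presumed_good
    print(sorted(presumed_good), sorted(presumed_faulty))
```

In this toy instance every accusation involves processor 4, so removing it leaves a graph without edges, and the single faulty candidate is isolated without fixing any threshold on the number of failed tests.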